Selectivity Options for Age-Structured Stock Assessments

A comparative framework using RTMB

Author
Affiliation

James Ianelli

NOAA/NMFS Alaska Fisheries Science Center

Published

February 13, 2026

Abstract

Selectivity is a fundamental component of age-structured stock assessment models, governing how fishing mortality and survey catchability vary with age or size. The choice of selectivity parameterization affects estimates of population abundance, fishing mortality, and management reference points. This manuscript compares several selectivity formulations—including standard logistic, double logistic, spline-based, and time-varying 2D autoregressive approaches—within a unified RTMB framework. We examine parameter identifiability, prior sensitivity, and the practical consequences of selectivity misspecification using simulation and application to eastern Bering Sea walleye pollock (Gadus chalcogrammus). Results highlight trade-offs between flexibility and estimability and provide guidance for practitioners choosing among selectivity options in operational assessments.

Keywords

selectivity, stock assessment, RTMB, fisheries, walleye pollock

0.1 Introduction

Selectivity functions describe the relative vulnerability of fish to capture as a function of age (or size) and are among the most influential—yet least observable—components of stock assessment models. The assumed shape of the selectivity curve directly affects estimates of spawning biomass, recruitment, and fishing mortality, and misspecification can propagate into biased reference points and management advice (Punt, Hurtado-Ferro, and Whitten 2013; Thompson 1994).

Despite its importance, selectivity is often treated as a modelling convenience rather than a biological or technological quantity to be estimated carefully. Most operational assessments adopt a logistic or double-logistic functional form, chosen for parsimony rather than fidelity to the underlying catch process. More flexible alternatives—such as penalized splines or non-parametric approaches—can reduce structural bias but may introduce identifiability problems, particularly when data are sparse or conflicting.

This manuscript develops a comparative framework for evaluating selectivity options within the R Template Model Builder (RTMB) environment. RTMB provides automatic differentiation, Laplace approximation for random effects, and integration with Bayesian sampling via SparseNUTS, making it a natural platform for exploring both maximum likelihood and posterior-based inference on selectivity parameters.

We organize the comparison around four selectivity families:

  1. Standard logistic (Section 0.4): the two-parameter ascending logistic, widely used for fisheries where retention is assumed to be monotonically increasing with age.
  2. Double logistic (Section 1.5): a dome-shaped three-parameter formulation allowing selectivity to decline at older ages, motivated by gear avoidance, ontogenetic habitat shifts, or differential availability.
  3. Spline-based (Section 2.9): a penalized B-spline approach offering flexible, data-driven selectivity shapes with smoothness controlled by a penalty parameter.
  4. Time-varying 2D AR1 (Section 3.6): a separable autoregressive structure over year and age dimensions, allowing selectivity to evolve smoothly through time while maintaining age-specific coherence via a Kronecker-structured precision matrix.

For each family, we present the mathematical formulation, an RTMB implementation with prior specification, parameter sensitivity analysis, and MCMC-based posterior evaluation. We then apply all to eastern Bering Sea walleye pollock data to illustrate practical trade-offs.

0.2 Review Scope and Literature Context

This review focuses on methodological literature that is directly relevant to three practical assessment decisions: (i) choice of selectivity functional form, (ii) model selection among competing selectivity hypotheses, and (iii) construction of time-varying selectivity processes with explicit regularization.

0.2.1 Selectivity form choice in integrated assessments

Operationally, selectivity in integrated assessments is best interpreted as the combined effect of gear contact/retention and fish availability, not as a purely mechanistic gear curve (Punt, Hurtado-Ferro, and Whitten 2013; Privitera-Johnson, Methot, and Punt 2022). In that setting, the choice of selectivity parameterization is consequential because it changes inference on spawning biomass, recruitment, and fishing mortality.

Selectivity-form selection therefore requires both statistical and biological diagnostics, including residual pattern checks, sensitivity analyses, and information-based comparisons among plausible alternatives (Punt, Hurtado-Ferro, and Whitten 2013). A key warning is that non-monotone selectivity can be strongly confounded with natural mortality, particularly when older ages are weakly informed by composition data (Thompson 1994).

0.2.2 Gear-selectivity estimation foundations

Classical selectivity-estimation literature remains important for assessment modeling because it clarifies what selectivity data can identify and where latent availability effects remain unresolved (Millar and Fryer 1999; Wileman et al. 1996). This evidence base supports explicit regularization whenever flexible age-specific selectivity surfaces are estimated.

0.2.3 Time-varying selectivity literature

Simulation studies show that time-varying selectivity can materially improve or degrade assessment performance depending on how process flexibility matches the data-generating mechanism (Linton and Bence 2011). Practitioner guidance also emphasizes that subjective choices such as block boundaries, smoothing penalties, and process assumptions can dominate outcomes (Martell and Stewart 2014).

Recent semi-parametric approaches formalize autocorrelation across both age and time, motivating separable latent-process structures that can be estimated within modern AD frameworks (Xu et al. 2019). Broader SCA model-selection and time-varying process work reinforces the same bias-variance trade-off (Wilberg and Bence 2006, 2008).

0.3 Statistical Framing as a Latent Process

Although selectivity is often presented as a deterministic curve, it is more usefully viewed as a latent process that allocates fishing intensity over age and time. Under this framing, selecting a functional form is equivalent to selecting a prior precision structure on an unobserved selectivity surface.

Low-dimensional parametric forms (for example, logistic and double logistic) impose strong structural priors and are robust when data are sparse. Flexible forms (for example, spline and time-varying AR structures) reduce structural bias risk, but require explicit regularization to prevent confounding and overfitting (Martell and Stewart 2014; Xu et al. 2019).

This perspective motivates three principles for applied assessments:

  1. Match effective selectivity complexity to data information content.
  2. Treat regularization (penalties/priors/precision structures) as essential, not optional.
  3. Use model-selection tools (AIC, DIC, cross-validation, posterior predictive checks) to evaluate bias-variance trade-offs under explicit prior structure (Punt, Hurtado-Ferro, and Whitten 2013; Wilberg and Bence 2008).

Within this unifying view, logistic, dome-shaped, spline, blocked, and AR(1) approaches differ primarily in the precision matrix implied for latent selectivity deviations, rather than only in curve geometry.

0.4 Standard Logistic Selectivity

1 Standard Logistic Selectivity

Two-parameter ascending logistic formulation

1.1 Overview

The standard (ascending) logistic selectivity is the simplest and most commonly used parameterization in age-structured stock assessments. It assumes that selectivity increases monotonically with age and asymptotes at full selectivity. Its main advantage is parsimony: a small number of parameters can often stabilize inference when composition data are limited [Methot and Wetzel (2013); Punt, Hurtado-Ferro, and Whitten (2013)].

As a consequence, logistic selectivity is frequently used as a baseline candidate in model-comparison workflows. However, when older fish exhibit reduced vulnerability, strict monotonicity can induce structural bias and shift variation into other parameters [Thompson (1994)].

1.2 Model definition

With parameters \(a_{50}\) (age at 50% selectivity) and \(\delta\) (slope width from 50% to 95% selectivity):

\[ \text{sel}(a) = \frac{1}{1 + \exp\left[-\log(19)\,\frac{a - a_{50}}{\delta}\right]} \]

where \(\delta > 0\) controls how quickly selectivity transitions from low to high values.

1.3 Scenario exploration

Figure 1: Standard logistic selectivity under different parameterizations.

1.4 RTMB implementation

Placeholder: RTMB objective function with priors on \(a_{50}\) and \(\delta\), optimization, and MCMC sampling.

Show R code
library(RTMB)

# Objective function skeleton
f <- function(parms) {
  getAll(data, parms, warn = FALSE)
  a50 <- exp(log_a50)
  delta <- exp(log_delta)
  sel_hat <- logistic_sel(age, a50, delta)

  nll <- 0
  # Priors
  nll <- nll - dnorm(log_a50, mu_log_a50, sd_log_a50, log = TRUE)
  nll <- nll - dnorm(log_delta, mu_log_delta, sd_log_delta, log = TRUE)

  ADREPORT(c(a50 = a50, delta = delta))
  nll
}
Source: Standard Logistic Selectivity

1.5 Double Logistic Selectivity

2 Double Logistic Selectivity (3-parameter formulation)

Parameterization, sensitivity, and prior elicitation via RTMB

2.1 Overview

This notebook documents the 3-parameter double logistic selectivity curve used for prior elicitation in the EBS pollock assessment workflow. The formulation allows dome-shaped selectivity, where vulnerability increases with age to a peak and then declines. This behavior is motivated by gear avoidance at older ages, ontogenetic habitat shifts, or differential availability between age classes.

Within integrated assessments, double-logistic forms are often a practical extension of monotone logistic selectivity because they can represent declining vulnerability at older ages while remaining low-dimensional [Punt, Hurtado-Ferro, and Whitten (2013)]. Interpretation still requires care: dome shape can be confounded with natural mortality and cohort processes if independent information is weak [Thompson (1994)].

2.2 Model definition

Let age be \(a\), with parameters \(p_1\), \(p_2\), and \(p_3\) and derived inflection points \(\gamma_1\) and \(\gamma_2\):

\[ \gamma_1 = p_1 + p_2, \qquad \gamma_2 = 2p_1 + p_2 + p_3. \]

The ascending limb is a standard logistic and the descending limb is one minus a logistic:

\[ \text{asc}(a) = \frac{1}{1 + \exp\left[-\log(19)\,\frac{a - \gamma_1}{p_1}\right]}, \]

\[ \text{desc}(a) = 1 - \frac{1}{1 + \exp\left[-\log(19)\,\frac{a - \gamma_2}{p_3}\right]}. \]

The full double logistic selectivity is

\[ \text{sel}(a) = \min\left(1,\ \frac{\text{asc}(a)\,\text{desc}(a)}{0.95^2}\right). \]

2.2.1 Interpretation

  • \(p_1 > 0\) controls the ascending slope and is the distance from \(\gamma_1\) (50% selectivity) to the 95% point.
  • \(p_2\) shifts the ascending limb through \(\gamma_1 = p_1 + p_2\).
  • \(p_3 > 0\) controls the descending slope and is the distance from \(\gamma_2\) (50% on the descending limb) to the 5% point.
  • The dome width is \(\gamma_2 - \gamma_1 = p_1 + p_3\).
  • The factor \(0.95^{-2}\) normalizes the curve so that when both limbs are at 0.95 the product is near 1 (then capped at 1.0).

2.3 Scenarios

Table 1: Scenario parameters and derived inflection points
scenario p1 p2 p3 gamma1 gamma2 dome_width
Wide dome 2.0 4 2.5 6.0 10.5 4.5
Asymmetric
(fast rise, slow decline) 0.8 4 3.0 4.8 8.6 3.8
Asymmetric
(slow rise, fast decline) 2.0 3 0.8 5.0 7.8 2.8
Nearly asymptotic 1.5 4 8.0 5.5 15.0 9.5

The scenario summary in Table 1 lists the four parameter sets and the derived inflection points (\(\gamma_1\), \(\gamma_2\)) used in the selectivity scenarios.

2.4 Faceted scenario curves

Each scenario appears on its own panel in Figure 2 so the ascent, descent, and dome width can be compared against the 50% and 95% reference lines.

Figure 2: Double logistic selectivity for each parameter scenario.

2.5 Overlay comparison

All scenarios are overlaid in Figure 3 to highlight differences in peak age and decline shape when the curves are viewed on a common axis.

Figure 3: All scenarios overlaid for direct comparison.

2.6 Parameter sensitivity analysis

The three panels in Figure 4 show how each parameter changes the curve when the other two are held fixed.

Figure 4: Sensitivity to p1, p2, and p3 with other parameters fixed.

2.7 RTMB implementation with priors

A common approach is to estimate on the log scale for the positive parameters and apply lognormal priors on \(p_1\) and \(p_3\):

\[ \log(p_k) \sim \mathcal{N}(\mu_k, \sigma_k^2), \qquad k \in \{1, 3\}. \]

If using a lognormal prior specified by median \(m\) and coefficient of variation \(\text{CV}\) on the natural scale, a convenient conversion is

\[ \mu = \log(m), \qquad \sigma = \sqrt{\log(\text{CV}^2 + 1)}. \]

Parameter \(p_2\) can remain unconstrained (normal prior) or be modeled on the log scale if you want to enforce \(p_2 > 0\).

Show R code
library(RTMB)

dbl_logistic_sel <- function(age, p1, p2, p3) {
  gamma1 <- p1 + p2
  gamma2 <- 2 * p1 + p2 + p3
  asc <- 1 / (1 + exp(-log(19) * (age - gamma1) / p1))
  desc <- 1 - 1 / (1 + exp(-log(19) * (age - gamma2) / p3))
  (asc * desc * 0.95^(-2))
}

lognorm_from_median_cv <- function(median, cv) {
  sigma <- sqrt(log(cv^2 + 1))
  mu <- log(median)
  list(mu = mu, sigma = sigma)
}

age <- seq(1, 15, by = 0.25)
sel_obs <- c(
  0, 0, 0, 0, 0, 0, 0, 0.01, 0.01, 0.037142857, 0.064285714, 0.091428571,
  0.118571429, 0.145714286, 0.172857143, 0.2, 0.205882353, 0.211764706,
  0.217647059, 0.223529412, 0.229411765, 0.235294118, 0.241176471,
  0.247058824, 0.252941176, 0.258823529, 0.264705882, 0.270588235,
  0.276470588, 0.282352941, 0.288235294, 0.294117647, 0.3, 0.31, 0.32,
  0.33, 0.34, 0.35, 0.36, 0.37, 0.38, 0.39, 0.4, 0.41, 0.42, 0.43,
  0.44, 0.45, 0.46, 0.47, 0.48, 0.49, 0.5, 0.5125, 0.525, 0.5375, 0.55,
  0.5625, 0.575, 0.5875, 0.6, 0.6125, 0.625, 0.6375, 0.65, 0.6625,
  0.675, 0.6875, 0.7, 0.7125, 0.725, 0.7375, 0.75, 0.7625, 0.775,
  0.7875, 0.8, 0.807894737, 0.815789474, 0.823684211, 0.831578947,
  0.839473684, 0.847368421, 0.855263158, 0.863157895, 0.871052632,
  0.878947368, 0.886842105, 0.894736842, 0.902631579, 0.910526316,
  0.918421053, 0.926315789, 0.934210526, 0.942105263, 0.95, 0.954545455,
  0.959090909, 0.963636364, 0.968181818, 0.972727273, 0.977272727,
  0.981818182, 0.986363636, 0.990909091, 0.995454545, 1, 1, 1, 1, 1, 1,
  1, 0.989285714, 0.978571429, 0.967857143, 0.957142857, 0.946428571,
  0.935714286, 0.925, 0.914285714, 0.903571429, 0.892857143, 0.882142857,
  0.871428571, 0.860714286, 0.85, 0.839285714, 0.828571429, 0.817857143,
  0.807142857, 0.796428571, 0.785714286, 0.775, 0.764285714, 0.753571429,
  0.742857143, 0.732142857, 0.721428571, 0.710714286, 0.7
)
sel_sd <- rep(0.05, length(age))

p1_prior <- lognorm_from_median_cv(median = 1.5, cv = 0.4)
p3_prior <- lognorm_from_median_cv(median = 2.5, cv = 0.6)
p2_prior <- list(mu = 4, sigma = 1)

make_prior_data <- function(p1_prior, p2_prior, p3_prior) {
  list(
    age = age,
    sel_obs = sel_obs,
    sel_sd = sel_sd,
    mu_log_p1 = p1_prior$mu,
    sd_log_p1 = p1_prior$sigma,
    mu_p2 = p2_prior$mu,
    sd_p2 = p2_prior$sigma,
    mu_log_p3 = p3_prior$mu,
    sd_log_p3 = p3_prior$sigma
  )
}

data <- make_prior_data(p1_prior, p2_prior, p3_prior)

parameters <- list(
  log_p1 = log(1.5),
  p2 = 4,
  log_p3 = log(2.5)
)

make_obj <- function(data, parameters) {
  f <- function(parms) {
    getAll(data, parms, warn = FALSE)
    p1 <- exp(log_p1)
    p3 <- exp(log_p3)
    sel_hat <- dbl_logistic_sel(age, p1, p2, p3)

    nll <- 0
    nll <- nll - dnorm(log_p1, mu_log_p1, sd_log_p1, log = TRUE)
    nll <- nll - dnorm(p2, mu_p2, sd_p2, log = TRUE)
    nll <- nll - dnorm(log_p3, mu_log_p3, sd_log_p3, log = TRUE)

    ADREPORT(c(p1 = p1, p2 = p2, p3 = p3))
    nll
  }

  MakeADFun(f, parameters)
}

obj <- make_obj(data, parameters)
opt <- nlminb(obj$par, obj$fn, obj$gr)
outer mgc:  0 outer mgc:  0 
outer mgc:  0.006737636 
outer mgc:  0.006737636 
outer mgc:  0.001 
outer mgc:  0.001 
outer mgc:  0.003252194 
outer mgc:  0.003252194 
outer mgc:  2.5 

2.8 MCMC evaluation

Show R code
library(SparseNUTS)


Gradient evaluation took 3.6e-05 seconds
1000 transitions using 10 leapfrog steps per transition would take 0.36 seconds.
Adjust your expectations accordingly!



 Elapsed Time: 0.019 seconds (Warm-up)
               0.13 seconds (Sampling)
               0.149 seconds (Total)



Model 'selectivity_set1' has 3 pars, and was fit using NUTS with a 'diag' metric
1 chain(s) of 1300 total iterations (150 warmup) were used
Average run time per chain was 0.15 seconds 
Minimum ESS=360.7 (31.37%), and maximum Rhat=1.015
There were 0 divergences after warmup
Figure 5: Marginal posterior distributions for selectivity parameters (SparseNUTS, prior set 1).

The marginal posterior densities in Figure 5 summarize \((p_1, p_2, p_3)\) under prior set 1.

Figure 6: Pairwise posterior scatterplots for selectivity parameters.
Figure 7: MCMC selectivity draws for prior set 1 (p1 median 1.5, p3 median 2.5, p2 sd 1).

2.8.1 Additional prior sets



Gradient evaluation took 2.3e-05 seconds
1000 transitions using 10 leapfrog steps per transition would take 0.23 seconds.
Adjust your expectations accordingly!



 Elapsed Time: 0.017 seconds (Warm-up)
               0.126 seconds (Sampling)
               0.143 seconds (Total)



Model 'selectivity_set2' has 3 pars, and was fit using NUTS with a 'diag' metric
1 chain(s) of 1300 total iterations (150 warmup) were used
Average run time per chain was 0.14 seconds 
Minimum ESS=393.8 (34.24%), and maximum Rhat=1.003
There were 0 divergences after warmup


Gradient evaluation took 2.7e-05 seconds
1000 transitions using 10 leapfrog steps per transition would take 0.27 seconds.
Adjust your expectations accordingly!



 Elapsed Time: 0.017 seconds (Warm-up)
               0.135 seconds (Sampling)
               0.152 seconds (Total)



Model 'selectivity_set3' has 3 pars, and was fit using NUTS with a 'diag' metric
1 chain(s) of 1300 total iterations (150 warmup) were used
Average run time per chain was 0.15 seconds 
Minimum ESS=427 (37.13%), and maximum Rhat=1.005
There were 0 divergences after warmup
Figure 8: MCMC selectivity draws for prior set 2 (p1 median 1.5, p3 median 8, p2 sd 1).
Figure 9: MCMC selectivity draws for prior set 3 (p1 median 1.5, p3 median 8, p2 sd 2).
Source: Double Logistic Selectivity (3-parameter formulation)

2.9 Spline-Based Selectivity

3 Spline-Based Selectivity

Penalized B-spline selectivity with smoothness control

3.1 Overview

Spline-based selectivity provides a flexible, semi-parametric alternative to logistic functional forms. Rather than imposing a specific shape (monotone or dome-shaped), a B-spline basis is used to represent selectivity as a smooth function of age, with the degree of smoothness controlled by a penalty on the second differences of the spline coefficients.

This formulation is consistent with a broader selectivity-estimation literature that emphasizes flexible shape representation together with explicit regularization [Millar and Fryer (1999); Wileman et al. (1996)]. In assessment applications, spline penalties act as the key control on effective complexity and therefore on bias-variance trade-offs [Martell and Stewart (2014)].

3.2 Model definition

Let \(\mathbf{B}(a)\) be a B-spline basis matrix evaluated at ages \(a\), with \(K\) basis functions and coefficient vector \(\boldsymbol{\beta}\). The log-selectivity is

\[ \log\,\text{sel}(a) = \mathbf{B}(a)\,\boldsymbol{\beta} \]

with selectivity normalized so the maximum equals 1:

\[ \text{sel}(a) = \frac{\exp(\mathbf{B}(a)\,\boldsymbol{\beta})} {\max_a \exp(\mathbf{B}(a)\,\boldsymbol{\beta})}. \]

Smoothness is enforced via a penalty on second differences:

\[ \text{penalty} = \frac{\lambda}{2} \sum_{k=3}^{K} (\beta_k - 2\beta_{k-1} + \beta_{k-2})^2 \]

where \(\lambda\) controls the trade-off between fit and smoothness.

3.3 Basis visualization

Figure 10: B-spline basis functions for selectivity modeling (K = 6 knots).

3.4 Example curves

Figure 11: Example spline selectivity curves with different coefficient vectors.

3.5 RTMB implementation

Placeholder: RTMB objective with spline coefficients as parameters, smoothness penalty as a tuning parameter or estimated, and MCMC sampling.

Show R code
library(RTMB)

# Objective function skeleton
f <- function(parms) {
  getAll(data, parms, warn = FALSE)
  log_sel <- B %*% beta
  sel_hat <- exp(log_sel - max(log_sel))

  # Second-difference penalty
  K <- length(beta)
  d2 <- beta[3:K] - 2 * beta[2:(K-1)] + beta[1:(K-2)]
  penalty <- 0.5 * exp(log_lambda) * sum(d2^2)

  nll <- penalty
  # Add data likelihood here
  # nll <- nll - sum(dnorm(sel_obs, sel_hat, sel_sd, log = TRUE))

  ADREPORT(sel_hat)
  nll
}
Source: Spline-Based Selectivity

3.6 Time-Varying Selectivity with 2D Autoregressive Structure

4 Time-Varying Selectivity with 2D Autoregressive Structure

Separable AR1 processes in year and age via RTMB

4.1 Overview

In many fisheries, selectivity is not constant over time. Changes in fishing gear, spatial effort allocation, and fish distribution can all shift the age-specific vulnerability pattern across years. A time-varying selectivity model captures these dynamics by allowing log-selectivity deviations to vary over a year \(\times\) age matrix, with temporal and ontogenetic smoothness enforced through a separable two-dimensional autoregressive prior.

This formulation uses RTMB’s dautoreg and dseparable functions to construct a Kronecker-structured precision matrix, providing a computationally efficient and differentiable prior for the year–age deviation surface.

4.2 Model definition

4.2.1 Selectivity-at-age by year

Let \(f\) index fishery, \(y\) index year, and \(a\) index age. For each fishery, a matrix of log-selectivity deviations \(\boldsymbol{\epsilon}_f\) is defined over years \(y = 1, \ldots, Y\) and estimated ages \(a = a_{\min}, \ldots, a_{\max}\):

\[ \log s_{f,y,a} = \epsilon_{f,y,a} \]

The realized selectivity is obtained by exponentiating and normalizing so that the mean across ages within each year equals 1:

\[ s_{f,y,a} = \frac{\exp(\epsilon_{f,y,a})} {\frac{1}{a_{\max} - a_{\min} + 1}\sum_{a'=a_{\min}}^{a_{\max}} \exp(\epsilon_{f,y,a'})} \] For ages beyond \(a_{\max}\), selectivity may be held constant at the value for \(a_{\max}\) (a “plus-group” extension):

\[ s_{f,y,a} = s_{f,y,a_{\max}}, \qquad a > a_{\max}. \]

4.2.2 Block structure for change years

In practice, selectivity is not re-estimated in every year. Instead, change years define blocks within which selectivity is constant. Let \(\mathbf{C}_f\) be a binary indicator matrix where \(C_{f,y} = 1\) if year \(y\) starts a new selectivity block for fishery \(f\). Within a block, selectivity repeats the values from the block’s initial year:

\[ s_{f,y,a} = \begin{cases} s_{f,y,a} & \text{if } C_{f,y} = 1 \\ s_{f,y-1,a} & \text{if } C_{f,y} = 0 \end{cases} \]

This reduces the number of free parameters from \(Y \times (a_{\max} - a_{\min} + 1)\) to \(B_f \times (a_{\max} - a_{\min} + 1)\) where \(B_f\) is the number of selectivity blocks for fishery \(f\).

4.3 Separable 2D AR1 prior

4.3.1 Marginal AR1 processes

The log-selectivity deviations for each fishery are assumed to follow a separable two-dimensional Gaussian Markov random field (GMRF). The key idea is that temporal (year) and ontogenetic (age) correlation structures are modeled independently, and their joint distribution is constructed via a Kronecker product.

Define two stationary AR1 processes:

Year dimension. For a sequence \(x_1, \ldots, x_Y\):

\[ x_y = \rho_y\, x_{y-1} + \sigma_y\,\eta_y, \qquad \eta_y \sim \mathcal{N}(0,1) \]

where \(\rho_y \in (-1, 1)\) is the year-to-year autocorrelation. The marginal variance is \(\sigma_y^2 / (1 - \rho_y^2)\) and the precision (inverse covariance) matrix for a length-\(Y\) sequence is the tridiagonal matrix \(\mathbf{Q}_y\) with:

\[ [\mathbf{Q}_y]_{ij} = \begin{cases} 1 + \rho_y^2 & i = j,\ 1 < i < Y \\ 1 & i = j = 1 \text{ or } i = j = Y \\ -\rho_y & |i - j| = 1 \\ 0 & \text{otherwise} \end{cases} \]

scaled by \(1 / \sigma_y^2\).

Age dimension. Analogously, for ages \(a_{\min}, \ldots, a_{\max}\):

\[ z_a = \rho_a\, z_{a-1} + \sigma_a\,\nu_a, \qquad \nu_a \sim \mathcal{N}(0,1) \]

with autocorrelation \(\rho_a\) and precision matrix \(\mathbf{Q}_a\) of the same tridiagonal form.

4.3.2 Kronecker product structure

The joint precision matrix for the vectorized year–age deviation \(\text{vec}(\boldsymbol{\epsilon}_f)\) is the Kronecker product:

\[ \mathbf{Q} = \mathbf{Q}_y \otimes \mathbf{Q}_a \]

This encodes the assumption that the year and age correlation structures are separable: the correlation between \(\epsilon_{y,a}\) and \(\epsilon_{y',a'}\) factorizes as

\[ \text{Cor}(\epsilon_{y,a},\, \epsilon_{y',a'}) = \rho_y^{|y - y'|}\,\rho_a^{|a - a'|} \]

The corresponding joint covariance matrix is:

\[ \boldsymbol{\Sigma} = \boldsymbol{\Sigma}_y \otimes \boldsymbol{\Sigma}_a \]

where \(\boldsymbol{\Sigma}_y\) and \(\boldsymbol{\Sigma}_a\) are the marginal AR1 covariance matrices for year and age, respectively.

4.3.3 Marginal scale

The overall marginal standard deviation of any single element \(\epsilon_{f,y,a}\) is

\[ \text{SD}(\epsilon_{f,y,a}) = \frac{\sigma_f} {\sqrt{1 - \rho_{y,f}^2}\;\sqrt{1 - \rho_{a,f}^2}} \]

where \(\sigma_f\) is the innovation scale for fishery \(f\). This quantity (scale in the RTMB code) ensures that the specified \(\sigma_f\) represents the innovation standard deviation while the marginal variance accounts for both autocorrelation parameters.

4.3.4 Log-density

The log-density of the separable GMRF, evaluated by RTMB’s dseparable, is:

\[ \log p(\boldsymbol{\epsilon}_f \mid \rho_y, \rho_a, \sigma) = -\frac{YA}{2}\log(2\pi) + \frac{A}{2}\log|\mathbf{Q}_y| + \frac{Y}{2}\log|\mathbf{Q}_a| - \frac{1}{2}\,\text{vec}(\boldsymbol{\epsilon}_f)^\top \left(\mathbf{Q}_y \otimes \mathbf{Q}_a\right) \text{vec}(\boldsymbol{\epsilon}_f) \]

where \(A = a_{\max} - a_{\min} + 1\) is the number of estimated ages. The Kronecker structure means the log-determinant decomposes as a sum of the individual log-determinants (scaled by dimension), and the quadratic form can be evaluated without materializing the full \(YA \times YA\) precision matrix.

4.4 RTMB implementation

The implementation uses dseparable to compose two dautoreg densities. The dautoreg(x, phi) function evaluates the log-density of an AR1 process with autocorrelation phi, and dseparable(f1, f2) constructs their Kronecker product density.

Show R code
library(RTMB)

get_selectivity_prior <- function(rho_y, rho_a, log_sigma, par_log_sel_fya) {
  n_sel <- length(par_log_sel_fya)
  lp_sel <- numeric(n_sel)
  for (f in seq_len(n_sel)) {
    # Marginal scale: sigma / sqrt((1-rho_y^2)(1-rho_a^2))
    scale <- exp(log_sigma[f]) /
      sqrt(1 - rho_y[f]^2) /
      sqrt(1 - rho_a[f]^2)

    # AR1 density in year dimension
    f1 <- function(x) dautoreg(x, phi = rho_y[f], log = TRUE)
    # AR1 density in age dimension
    f2 <- function(x) dautoreg(x, phi = rho_a[f], log = TRUE)

    # Separable 2D density via Kronecker product
    lp_sel[f] <- -dseparable(f1, f2)(
      par_log_sel_fya[[f]],
      scale = scale
    )
  }
  return(lp_sel)
}

4.4.1 dseparable mechanics

The call dseparable(f1, f2)(X, scale) performs the following:

  1. Treats the matrix X (dimension \(B_f \times A\)) as a 2D random field.
  2. Evaluates f1 along each row (year marginal) and f2 along each column (age marginal).
  3. Combines them via the Kronecker identity to compute the joint log-density without forming the full \(B_f A \times B_f A\) precision matrix.
  4. The scale parameter rescales the entire field so that the marginal standard deviation matches the specified value.

This is efficient because the computational cost scales as \(\mathcal{O}(B_f \cdot A)\) rather than \(\mathcal{O}((B_f \cdot A)^3)\).

4.4.2 Selectivity construction

The get_selectivity function builds the full 3D array (fishery \(\times\) year \(\times\) age) from the estimated log-selectivity blocks:

Show R code
get_selectivity <- function(n_age, max_age, first_yr, first_yr_catch,
                            sel_min_age_f, sel_max_age_f, sel_end_f,
                            sel_change_year_fy, par_log_sel) {
  n_sel  <- nrow(sel_change_year_fy)
  n_year <- ncol(sel_change_year_fy)
  ymin   <- first_yr_catch - first_yr + 1
  sel_fya <- array(0, dim = c(n_sel, n_year, n_age))

  for (f in seq_len(n_sel)) {
    amin <- sel_min_age_f[f] + 1
    amax <- sel_max_age_f[f] + 1
    ipar <- 1
    for (y in ymin:n_year) {
      if (sel_change_year_fy[f, y] != 0) {
        # New selectivity block: exponentiate and mean-normalize
        sel_tmp <- exp(par_log_sel[[f]][ipar, ])
        ipar <- ipar + 1
        sel_fya[f, y, amin:amax] <- sel_tmp / mean(sel_tmp)
        # Extend to plus group if needed
        if (as.logical(sel_end_f[f]) && amax < max_age) {
          for (a in (amax + 1):n_age) {
            sel_fya[f, y, a] <- sel_fya[f, y, amax]
          }
        }
      } else {
        # Carry forward previous block
        sel_fya[f, y, ] <- sel_fya[f, y - 1, ]
      }
    }
  }
  return(sel_fya)
}

4.5 Visualization

Figure 12: Simulated 2D AR1 selectivity surfaces under different autocorrelation settings.
Figure 13: Time-varying selectivity curves at selected years (high autocorrelation scenario).

4.6 Properties and considerations

The separable 2D AR1 formulation has several properties that make it attractive for operational stock assessments:

Parsimony. Three hyperparameters per fishery (\(\rho_y\), \(\rho_a\), \(\sigma\)) control the smoothness of an arbitrarily large year–age surface. The autocorrelation parameters shrink the effective number of free parameters well below the nominal \(B_f \times A\).

Computational efficiency. The Kronecker structure avoids forming or factorizing the full joint precision matrix. Combined with RTMB’s automatic differentiation, gradients of the log-prior are computed at the same \(\mathcal{O}(B_f \cdot A)\) cost as the density itself.

Interpretability. The parameter \(\rho_y\) directly controls how quickly selectivity can change between years (high values produce slow drift, low values allow abrupt shifts), while \(\rho_a\) controls the smoothness of the selectivity curve within any single year.

Integration with block structure. The change-year mechanism can be layered on top: the 2D AR1 prior applies to the matrix of block-specific selectivity patterns, not to every individual year. This reduces dimensionality further while still allowing the prior to borrow strength across blocks.

Source: Time-Varying Selectivity with 2D Autoregressive Structure

4.7 Comparison and Discussion

The four selectivity families evaluated here span a practical gradient from low-dimensional structural assumptions to high-dimensional latent-process flexibility. The logistic model (Section 0.4; Figure 1) provides an interpretable baseline with strong monotonicity assumptions and minimal parameter burden. The double-logistic model (Section 1.5; Table 1, Figure 2, Figure 4) adds biologically plausible dome-shape behavior, but also increases confounding risk with natural mortality and cohort effects when data support is limited (Thompson 1994).

Spline selectivity (Section 2.9; Figure 10; Figure 11) offers useful middle ground: flexible shape estimation with transparent control of roughness via penalization. In review terms, this is a semi-parametric compromise between rigid parametric forms and fully dynamic latent surfaces. The time-varying 2D AR1 approach (Section 3.6; Figure 12; Figure 13) further extends flexibility by allowing coherent year-by-age evolution with explicit process structure, aligning with recent recommendations for autocorrelated selectivity deviations (Linton and Bence 2011; Xu et al. 2019).

For scientific review and operational assessment use, a defensible workflow is:

  1. Start with a parsimonious baseline (logistic or double logistic) and diagnose fit deficiencies in composition residuals and retrospective patterns.
  2. Introduce additional flexibility (spline or time-varying structures) only when diagnostics indicate systematic lack of fit and data support richer structure.
  3. Compare candidate models with complementary evidence, including information criteria, predictive checks, and sensitivity of management quantities (Punt, Hurtado-Ferro, and Whitten 2013; Wilberg and Bence 2008).
  4. Report regularization choices and identifiability diagnostics explicitly, because inferred selectivity dynamics are conditional on prior/penalty assumptions (Martell and Stewart 2014; Privitera-Johnson, Methot, and Punt 2022).

Overall, the central trade-off is not whether one curve family is universally “best”, but whether the assumed latent structure is commensurate with the available information and management objectives.

References

Linton, Brenden C., and James R. Bence. 2011. “Catch-at-Age Assessment in the Face of Time-Varying Selectivity.” ICES Journal of Marine Science 68 (3): 611–25. https://doi.org/10.1093/icesjms/fsq173.
Martell, Steven J. D., and Ian J. Stewart. 2014. “Towards Defining Good Practices for Modeling Time-Varying Selectivity.” Fisheries Research 158: 84–95. https://doi.org/10.1016/j.fishres.2013.11.001.
Methot, Richard D., and Chantell R. Wetzel. 2013. “Stock Synthesis: A Biological and Statistical Framework for Fish Stock Assessment and Fishery Management.” Fisheries Research 142: 86–99. https://doi.org/10.1016/j.fishres.2012.10.012.
Millar, R. B., and R. J. Fryer. 1999. “Estimating the Size-Selection Curves of Towed Gears, Traps, Nets and Hooks.” Reviews in Fish Biology and Fisheries 9: 89–116. https://doi.org/10.1023/A:1008838220001.
Privitera-Johnson, Kristin, Richard D. Methot, and André E. Punt. 2022. “Towards Best Practice for Specifying Selectivity in Age-Structured Integrated Stock Assessments.” Fisheries Research 248: 106248. https://doi.org/10.1016/j.fishres.2022.106248.
Punt, André E., Felipe Hurtado-Ferro, and Athol R. Whitten. 2013. “Model Selection for Selectivity in Fisheries Stock Assessments.” Fisheries Research 141: 1–12. https://doi.org/10.1016/j.fishres.2013.02.008.
Thompson, Grant G. 1994. “Confounding of Gear Selectivity and the Natural Mortality Rate in Cases Where the Former Is a Nonmonotone Function of Age.” Canadian Journal of Fisheries and Aquatic Sciences 51 (12): 2654–65. https://doi.org/10.1139/f94-265.
Wilberg, Michael J., and James R. Bence. 2006. “Performance of Time-Varying Catchability Estimators in Statistical Catch-at-Age Analysis.” Canadian Journal of Fisheries and Aquatic Sciences 63: 2275–85. https://doi.org/10.1139/f06-111.
———. 2008. “Performance of Deviance Information Criterion Model Selection in Statistical Catch-at-Age Analysis.” Fisheries Research 93: 212–21. https://doi.org/10.1016/j.fishres.2008.04.010.
Wileman, D. A., R. S. T. Ferro, R. Fonteyne, and R. B. Millar. 1996. “Manual of Methods of Measuring the Selectivity of Towed Fishing Gears.” ICES Cooperative Research Report 215. Copenhagen: International Council for the Exploration of the Sea (ICES). https://ices-library.figshare.com/articles/report/Manual_of_methods_of_measuring_the_selectivity_of_towed_fishing_gears/18624446/1/files/33403508.pdf.
Xu, Haikun, James T. Thorson, Richard D. Methot, and Ian G. Taylor. 2019. “A New Semi-Parametric Method for Autocorrelated Age- and Time-Varying Selectivity in Age-Structured Assessment Models.” Canadian Journal of Fisheries and Aquatic Sciences 76: 268–85. https://doi.org/10.1139/cjfas-2017-0446.